Understanding the worker: Document processing and conversation management worker, part 1

worker.py is part of a chatbot application that processes user messages and documents. It uses LangChain, a Python framework for building conversational AI applications. The script is responsible for setting up the language model, processing PDF documents into a format suitable for conversational retrieval, and handling user prompts to generate responses based on the processed documents. A high-level overview of the script follows.

Your task is to fill in the worker.py comments with the appropriate code.

Let's break down each section in the worker file.
The worker.py is designed to provide a conversational interface that can answer questions based on the contents of a given PDF document.

The diagram illustrates the procedure of document processing and information retrieval, integrating a large language model (LLM) to perform question answering. The whole process happens in worker.py.
  1. Initialization init_llm():

    • Setting environment variables: The environment variable for the HuggingFace API token is set.
    • Loading the language model: The WatsonX language model is initialized with specified parameters.
    • Loading embeddings: Embeddings are initialized using a pre-trained model.
  2. Document processing process_document(document_path):
    This function is responsible for processing a given PDF document.

    • Loading the document: The document is loaded using PyPDFLoader.
    • Splitting text: The document is split into smaller chunks using RecursiveCharacterTextSplitter.
    • Creating embeddings database: An embeddings database is created from the text chunks using Chroma.
    • Setting Up the RetrievalQA chain: A RetrievalQA chain is set up to facilitate the question-answering process. This chain uses the initialized language model and the embeddings database to answer questions based on the processed document.
  3. User prompt processing process_prompt(prompt):
    This function processes a user's prompt or question.

    • Receiving user prompt: The system receives a user prompt (question).
    • Querying the model: The model is queried using the retrieval chain, and it generates a response based on the processed document and previous chat history.
    • Updating chat history: The chat history is updated with the new prompt and response.
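The prompt-handling flow in step 3 can be sketched in plain Python. This is an illustrative sketch, not the exact worker.py code: `chain` stands in for the retrieval chain built during document processing, and the dictionary keys are hypothetical.

```python
# Illustrative sketch of the process_prompt flow. `chain` stands in for
# the retrieval chain built in process_document; names are hypothetical.
chat_history = []  # list of (prompt, response) pairs

def process_prompt(prompt, chain):
    # Query the chain with the new prompt plus the accumulated history.
    output = chain({"question": prompt, "chat_history": chat_history})
    answer = output["answer"]
    # Record the exchange so later prompts can build on it.
    chat_history.append((prompt, answer))
    return answer

# A stub chain lets us exercise the flow without a real LLM.
def stub_chain(inputs):
    return {"answer": "You asked: " + inputs["question"]}

reply = process_prompt("What is the PDF about?", stub_chain)
```

The key design point is that the history is appended after every exchange, so each new query carries the full conversational context into the chain.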

Delving into each section

IBM watsonx utilizes various language models, including the Llama models by Meta, which have been among the strongest open-source language models published so far (as of February 2024).

  1. Initialization init_llm():

This code sets up and uses an AI language model from IBM watsonx:

  1. Credentials setup: Initializes a dictionary with the service URL and an authentication token ("skills-network").

  2. Parameters configuration: Sets up model parameters like maximum token generation (256) and temperature (0.1, controlling randomness).

  3. Model initialization: Creates a model instance with a specific model_id, using the credentials and parameters defined above, and specifies "skills-network" as the project ID.

  4. Model usage: Initializes an interface (WatsonxLLM) with the configured model for interaction.

This script is specifically configured for a project or environment associated with the “skills-network”.
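The credentials and parameter setup described in steps 1 and 2 might look like the following. This is a hedged sketch: the key names are illustrative and not necessarily the exact ibm-watsonx-ai API.

```python
# Hedged sketch of the credentials and parameter setup described above;
# key names are illustrative placeholders.
my_credentials = {
    "url": "https://us-south.ml.cloud.ibm.com",  # service URL
    "apikey": "skills-network",                  # placeholder token in this environment
}

params = {
    "max_new_tokens": 256,  # cap on tokens generated per response
    "temperature": 0.1,     # low temperature -> mostly deterministic output
}
```

A low temperature such as 0.1 keeps answers close to the highest-probability tokens, which suits factual question answering over a document.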

Complete the following code in worker.py by inserting the embeddings.

In this project, you do not need to specify your own Watsonx_API and Project_id. You can just specify project_id="skills-network" and leave Watsonx_API blank.

But it's important to note that this access method is exclusive to this Cloud IDE environment. If you are interested in using the model/API outside this environment (e.g., in a local environment), detailed instructions and further information are available in this tutorial.

# Placeholder for Watsonx_API and Project_id in case you need to use the code outside this environment
Watsonx_API = "Your WatsonX API"
Project_id = "Your Project ID"

# Function to initialize the language model and its embeddings
def init_llm():
    global llm_hub, embeddings
    logger.info("Initializing WatsonxLLM and embeddings...")

    # Llama model configuration
    MODEL_ID = "meta-llama/llama-3-3-70b-instruct"
    WATSONX_URL = "https://us-south.ml.cloud.ibm.com"
    PROJECT_ID = "skills-network"

    # Use the same parameters as before:
    # MAX_NEW_TOKENS: 256, TEMPERATURE: 0.1
    model_parameters = {
        # "decoding_method": "greedy",
        "max_new_tokens": 256,
        "temperature": 0.1,
    }

    # Initialize the Llama LLM using the updated WatsonxLLM API
    llm_hub = WatsonxLLM(
        model_id=MODEL_ID,
        url=WATSONX_URL,
        project_id=PROJECT_ID,
        params=model_parameters
    )
    logger.debug("WatsonxLLM initialized: %s", llm_hub)

    # Initialize embeddings using a pre-trained model to represent the text data.
    embeddings = # create an object of HuggingFaceInstructEmbeddings with (model_name, model_kwargs={"device": DEVICE})
    logger.debug("Embeddings initialized with model device: %s", DEVICE)
Solution:
    embeddings = HuggingFaceInstructEmbeddings(
        model_name="sentence-transformers/all-MiniLM-L6-v2",
        model_kwargs={"device": DEVICE}
    )
    logger.debug("Embeddings initialized with model device: %s", DEVICE)
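For intuition, the text-splitting step that process_document performs with RecursiveCharacterTextSplitter can be approximated in plain Python. This is a simplified, character-only sketch with illustrative parameter values; the real splitter additionally tries to break on natural separators such as paragraphs and sentences.

```python
# Simplified, character-only stand-in for RecursiveCharacterTextSplitter,
# illustrating the chunking step in process_document. The real splitter
# also prefers breaking on separators (paragraphs, sentences, words).
def split_text(text, chunk_size=1024, chunk_overlap=64):
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break
        # Step back by the overlap so neighbouring chunks share context.
        start = end - chunk_overlap
    return chunks

chunks = split_text("x" * 2500, chunk_size=1024, chunk_overlap=64)
```

The overlap matters for retrieval quality: without it, a sentence that straddles a chunk boundary would be cut in half and might never match a related query.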